Term-list Translation Using Mono-lingual Word Co-occurrence Vectors
Abstract
A term-list is a list of content words that characterize a consistent text or a concept. This paper presents a new method for translating a term-list by using a corpus in the target language. The method first retrieves alternative translations for each input word from a bilingual dictionary. It then determines the most 'coherent' combination of alternative translations, where the coherence of a set of words is defined as the proximity among multi-dimensional vectors produced from the words on the basis of co-occurrence statistics. The method was applied to term-lists extracted from newspaper articles and achieved 81% translation accuracy for ambiguous words (i.e., words with multiple translations).
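The two steps in the abstract (dictionary lookup, then choosing the most coherent combination) can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not say how proximity is measured or how combinations are searched, so the sketch assumes sentence-level co-occurrence counts, mean pairwise cosine similarity as the coherence score, and exhaustive search. The function names, the `src_*` source words, the toy dictionary, and the toy corpus are all hypothetical.

```python
from collections import Counter
from itertools import combinations, product
import math


def cooccurrence_vectors(sentences):
    """Map each word to a Counter of the words that share a sentence with it."""
    vectors = {}
    for sentence in sentences:
        words = set(sentence.split())
        for w in words:
            vectors.setdefault(w, Counter()).update(words - {w})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def coherence(words, vectors):
    """Assumed coherence score: mean pairwise cosine similarity of the words' vectors."""
    pairs = list(combinations(words, 2))
    if not pairs:
        return 0.0
    empty = Counter()
    total = sum(cosine(vectors.get(a, empty), vectors.get(b, empty)) for a, b in pairs)
    return total / len(pairs)


def translate_term_list(terms, dictionary, vectors):
    """Try every combination of alternative translations and keep the most coherent."""
    candidates = [dictionary[t] for t in terms]
    return max(product(*candidates), key=lambda combo: coherence(combo, vectors))


# Hypothetical toy data: a bilingual dictionary with ambiguous entries and a
# tiny target-language corpus (one "sentence" of content words per string).
dictionary = {
    "src_bank": ["bank", "bench"],
    "src_rate": ["rate", "guy"],
    "src_loan": ["loan"],
}
corpus = [
    "bank loan rate money",
    "bank rate interest loan",
    "bench park guy tree",
    "guy bench park dog",
]
vectors = cooccurrence_vectors(corpus)
best = translate_term_list(["src_bank", "src_rate", "src_loan"], dictionary, vectors)
print(best)
```

Exhaustive search grows exponentially with the number of ambiguous words, so a longer term-list would likely need a pruned or greedy search; the abstract does not say which strategy the paper uses.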
Similar resources
Trans-EZ at NTCIR-2 : Synset Co-occurrence Method for English-Chinese Cross-Lingual Information Retrieval
In this paper, a new method for English-Chinese cross-lingual information retrieval is proposed and evaluated in the NTCIR-II project. We use bilingual resources and contextual information to deal with word sense disambiguation (WSD) and translation disambiguation for query translation. An English-Chinese WordNet and a synset co-occurrence model are adopted to solve the problem of word sense...
Query Term Disambiguation Using Co-occurrence Statistics for Dictionary based Cross Lingual Information Retrieval
Query translation in cross-lingual information retrieval can be done using machine translation, parallel corpora, or a machine-readable dictionary. The most cost-effective and least time-consuming technique tends to be preferred. For this reason, many researchers opt for machine-readable dictionaries, which are easily available. Dictionaries usually provide more than one translation in ...
CUNY-UIUC-SRI TAC-KBP2011 Entity Linking System Description
In this paper we describe a joint effort by the City University of New York (CUNY), University of Illinois at Urbana-Champaign (UIUC) and SRI International at participating in the mono-lingual entity linking (MLEL) and cross-lingual entity linking (CLEL) tasks for the NIST Text Analysis Conference (TAC) Knowledge Base Population (KBP2011) track. The MLEL system is based on a simple combination ...
CO-graph: A new graph-based technique for cross-lingual word sense disambiguation
In this paper, we present a new method based on co-occurrence graphs for performing Cross-Lingual Word Sense Disambiguation (CLWSD). The proposed approach comprises the automatic generation of bilingual dictionaries, and a new technique for the construction of a co-occurrence graph used to select the most suitable translations from the dictionary. Different algorithms that combine both the dict...
Publication date: 1998